System and method for natural language operation of multifunction peripherals

ABSTRACT

A system and method for natural language-based multifunction peripheral control includes sensing when a portable data device is proximate to an MFP. A status of the MFP is monitored and user-specific configuration information is stored. The system receives activity data corresponding to performance of a preselected activity by a user and initiates a natural language exchange with a user of the portable data device in accordance with a monitored status of the multifunction peripheral and stored user-specific configuration settings. Document processing instructions received via the natural language exchange generate a natural language response. A second document processing instruction is then received via the natural language exchange responsive to the natural language response and a document processing operation is performed in accordance with the second document processing instruction.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Divisional of application Ser. No. 16/119,165 filed on Aug. 31, 2018, which claims the benefit of U.S. Provisional Application No. 62/584,475 filed Nov. 10, 2017, each of which is herein incorporated by reference in its entirety.

TECHNICAL FIELD

This application relates generally to voice assisted control of document processing device operation. The application relates more particularly to a natural language dialog between a user and a multifunction peripheral using a portable data device, such as a smartphone, as a verbal or touchscreen interface.

BACKGROUND

Document processing devices include printers, copiers, scanners and e-mail gateways. More recently, devices employing two or more of these functions are found in office environments. These devices are referred to as multifunction peripherals (MFPs) or multifunction devices (MFDs). As used herein, MFPs are understood to comprise printers, alone or in combination with other of the afore-noted functions. It is further understood that any suitable document processing device can be used.

Currently, most MFPs or other office devices are driven by a user interface such as a touch panel or button panel. In an effort to be more compliant with disabilities acts, some devices have become more sensitive to other communities by offering voice assisted user interfaces. However, this solution is expensive in terms of development and deployment and is not easily customized or tailored to an individual user's preferences.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments will become better understood with regard to the following description, appended claims and accompanying drawings wherein:

FIG. 1 is an example embodiment of a natural language document processing operation system;

FIG. 2 is an example embodiment of a document processing device;

FIG. 3 is an example embodiment of a portable digital device;

FIG. 4 is an example embodiment of voice assisted document processing operations;

FIG. 5 is an example embodiment of a natural language dialog control of a document processing operation;

FIG. 6 is another example embodiment of a flow diagram showing device and human interaction for natural language controlled document processing operations;

FIG. 7 is another example embodiment of a flow diagram showing device and human interaction for natural language controlled document processing operations;

FIG. 8 is another example embodiment of a flow diagram showing device and human interaction for natural language controlled document processing operations;

FIG. 9 is an example embodiment of example screenshots of a mobile app;

FIG. 10 is an example embodiment of name and password entry on a mobile app;

FIG. 11 is an example embodiment of a natural language dialog as reflected on a display;

FIG. 12 is an example embodiment of a device/user interaction scenario;

FIG. 13 is another example embodiment of a device/user interaction scenario; and

FIG. 14 is an example embodiment of keywords for device/user interaction scenarios.

DETAILED DESCRIPTION

The systems and methods disclosed herein are described in detail by way of examples and with reference to the figures. It will be appreciated that modifications to disclosed and described examples, arrangements, configurations, components, elements, apparatuses, devices, methods, systems, etc. can suitably be made and may be desired for a specific application. In this disclosure, any identification of specific techniques, arrangements, etc. are either related to a specific example presented or are merely a general description of such a technique, arrangement, etc. Identifications of specific details or examples are not intended to be, and should not be, construed as mandatory or limiting unless specifically designated as such.

In an example embodiment disclosed herein, a system and method for natural language-based multifunction peripheral control includes sensing when a portable data device is proximate to an MFP. A status of the MFP is monitored and user-specific configuration information is stored. The system receives activity data corresponding to performance of a preselected activity by a user and initiates a natural language exchange with a user of the portable data device in accordance with a monitored status of the multifunction peripheral and stored user-specific configuration settings. Document processing instructions received via the natural language exchange generate a natural language response. A second document processing instruction is then received via the natural language exchange responsive to the natural language response and a document processing operation is performed in accordance with the second document processing instruction.

Currently, most MFPs or other office devices are driven by a user interface such as a touch panel or button panel. In an effort to be more compliant with section 508 of the Rehabilitation Act and promote ease of use for people with (or without) disabilities, many devices offer voice assisted user interfaces. Successful implementation can be difficult and costly and may not be compatible with devices currently in the field. As will be detailed further below, the subject application includes example embodiments wherein a mobile data device, such as a smartphone, tablet computer, notebook computer, smart watch or the like is used to communicate wirelessly with an office device such as an MFP (via Bluetooth, NFC, Wi-Fi, etc.) to provide the user with a natural language user interface to accomplish device tasks. The provided voice input and voice response makes use of natural language, a menu driven “wizard” intelligent system, stored user preferences, and responses to both voice and physical inputs including recognition of paper in the paper tray to initiate a task. Provisioning of a mobile device application (“app”) that can translate device capabilities into a series of natural language prompts and similarly translate user responses into computer commands understood by the MFP creates a more accessible user interface for interacting with MFP devices. As used herein, natural language dialog includes any suitable device-user language communication, such as with both user and device speaking, with the user speaking and the device replying in characters, the user supplying character input and the device speaking, or both the user and the device communicating via characters. Responses may also be suitably supplied by users by device interaction, such as tapping a “yes” or “no” button displayed on a touchscreen or pressing one or more keys on an MFP or its display.

The system can include a mobile app that communicates with an office device such as an MFP to provide the user with a natural language voice interaction user interface to accomplish tasks, physical prompts or menu driven selections.

The user interface and accompanying software recognize voice input and respond with voice menu commands, including:

-   English commands
-   Japanese or other foreign language commands
-   Provision of visual feedback on the MFP or device including:
    -   Communicating questions
    -   Listening
    -   User response options
    -   Translating users' voice to text
    -   Device response and confirmation

Example embodiments herein describe an application and a system that interacts with a hardware device such as an MFP to provide users with a natural language user interface. The MFP app is opened or initiated when the user enters a proximity threshold, as defined by a beacon, or when it is invoked by the user via touch or voice activation.

Wireless communication is established between the app and MFP either optically or via radio frequency by either proximity, user preferences, barcode, Wi-Fi, Wi-Fi direct, QR code scan, or NFC, or the like. Once communication is established, user preferences and historical information are retrieved from the app; the app also queries the device for device capabilities.

Task invocation is suitably initiated either automatically by placing paper in an automatic document feeder, or on the glass, or by initiating a conversation, for example “Hey Moppy.”

A series of voice prompts is sent from the app to the user and responses are collected by the device to configure an MFP task. A wizard-like approach is used in that subsequent prompts sent to the user are based on previous responses in an attempt to efficiently communicate to acquire job details. The user's language is translated to computer commands understood by the device. Once the job is configured on the app, it is sent to the device for processing.

An example mobile application comprises a client app executing on a mobile device (for example iOS). The mobile app uses natural language to convert voice commands to MFP commands that initiate MFP tasks such as Copy, Print Release, Scan to email and other functions.

Embodiments herein include two basic systems.

A client side mobile app.

The client side mobile app listens to the user's voice, translates it to text locally, shows the conversation on the UI, or user interface, and sends the text to the server.

Client side can use iOS Siri or any other suitable voice recognition.

Running on the MFP is a background app.

The app resides on the MFP and receives the text strings from the client app. The app parses the text and converts the text to print commands.

A dictionary resides on the app server (HTTP REST server) that allows natural communication by accepting a variety of phrases for a copy command. For example, both terms “duplex” and “2-sided” can be recognized as a copy command for printing on both sides of the page.
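As a rough illustration of the dictionary idea, the following Python sketch maps several spoken variants onto one canonical copy setting. The table contents and the names `PHRASES`, `normalize` and `recognize` are assumptions made for this example; the disclosure does not specify the dictionary's format or matching rules.

```python
# Hypothetical phrase dictionary: each canonical copy setting lists the
# spoken variants that should map to it. Entries are illustrative only.
PHRASES = {
    "duplex": ["duplex", "2-sided", "two-sided", "double-sided", "both sides"],
    "staple": ["staple", "stapled"],
}


def normalize(text: str) -> str:
    """Lowercase the text and treat long dash variants as plain hyphens."""
    for dash in ("\u2014", "\u2013"):
        text = text.replace(dash, "-")
    return text.lower()


def recognize(text: str) -> set[str]:
    """Return the canonical settings whose variants appear in the text."""
    cleaned = normalize(text)
    return {
        setting
        for setting, variants in PHRASES.items()
        if any(variant in cleaned for variant in variants)
    }
```

Under these assumptions, `recognize("Make a duplex copy")` and `recognize("2-sided please")` both yield the same `duplex` setting, which is the behavior the paragraph above describes. Simple substring matching is used only to keep the sketch short; a deployed dictionary would likely need tokenization and per-language tables.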

A registered user can log in to the MFP from the app, which is optionally fingerprint enabled. This would allow the user to interact with devices with authentication required, allow any user preferences to be transferred to the job itself, and allow the user to release held print jobs using natural language.

FIG. 1 illustrates an example embodiment of a natural language operated system 100 including one or more MFPs, such as MFP 104. MFP 104 is suitably connected to network 108 by any suitable wired or wireless data path. Network 108 is suitably comprised of a local area network (LAN), wide area network (WAN), which may comprise the Internet, or any suitable combination thereof. MFP 104 is suitably provided with an ability for wireless communication with portable data devices such as smartphone 112. Communication is suitably via Wi-Fi, including Wi-Fi direct, via near field communication (NFC), Bluetooth, or the like.

User 116 in possession of smartphone 112 approaches MFP 104. Proximity is determined by any suitable means, including from Bluetooth beacon 120, NFC interface 124, or by detection of a marking on MFP 104, such as QR code 128. Smartphone 112 is running an interface app, and stores the user's document processing preferences and preferred language. When a user 116 is sufficiently proximate, they may initiate a natural language communication setting between MFP 104 and smartphone 112 by matching a preset pattern. Initiation is suitably triggered by distance, or by mechanical interaction of the user 116, such as pressing on their smartphone 112 touchscreen or MFP 104 user interface. Other mechanical interactions that may trigger a session include opening a document feeder 124 on MFP 104 as indicated or placing a document on a scanner platen. Initiation is also suitably commenced by uttering a wakeup phrase to smartphone 112 while running the app including a feature to listen continuously for such a phrase while running, including running in the background of other concurrent apps. Natural language input is suitably converted to text via a processor on MFP 104, a processor on smartphone 112, or a combination thereof. Processing and text-to-speech conversion is also suitably done by a networked language processor 132, suitably operable to receive a digital voice file and return a corresponding text file, thus eliminating dedicated hardware or software to provide such conversion. Document processing operations are then completed by a natural language dialog as will be detailed further below.
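The session trigger described above can be pictured as a small predicate: a dialog starts only when the phone is close enough and the user's input matches a preset pattern. The Python sketch below is illustrative only. The distance threshold, the event names and the `Observation` type are assumptions introduced for this example, and the sketch follows the reading in which both proximity and a pattern match are required.

```python
from dataclasses import dataclass

# All values below are assumptions for illustration; the disclosure does not
# fix a distance threshold, a wake phrase list, or event names.
PROXIMITY_THRESHOLD_M = 2.0
WAKE_PHRASES = ("hey moppy",)
TRIGGER_EVENTS = {"feeder_opened", "document_on_platen", "touchscreen_tap"}


@dataclass
class Observation:
    """One snapshot of what the app currently knows about the user."""
    distance_m: float      # estimated distance to the MFP, e.g. from a beacon
    utterance: str = ""    # latest recognized speech, if any
    event: str = ""        # latest mechanical interaction, if any


def should_start_dialog(obs: Observation) -> bool:
    """Start a dialog only when the user is near and a preset pattern matches."""
    if obs.distance_m > PROXIMITY_THRESHOLD_M:
        return False
    spoken = obs.utterance.lower()
    if any(phrase in spoken for phrase in WAKE_PHRASES):
        return True
    return obs.event in TRIGGER_EVENTS
```

For example, `should_start_dialog(Observation(1.0, event="feeder_opened"))` is true, while the same wake phrase heard at five meters is ignored.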

Turning now to FIG. 2, illustrated is an example embodiment of an MFP device comprised of a document rendering system 200 suitably comprised within an MFP, such as with MFP 104 of FIG. 1. Included in intelligent controller 201 are one or more processors, such as that illustrated by processor 202. Each processor is suitably associated with non-volatile memory, such as ROM 204, and random access memory (RAM) 206, via a data bus 212.

Processor 202 is also in data communication with a storage interface 208 for reading or writing to a storage 216, suitably comprised of a hard disk, optical disk, solid-state disk, cloud-based storage, or any other suitable data storage as will be appreciated by one of ordinary skill in the art.

Processor 202 is also in data communication with a network interface 210 which provides an interface to a network interface controller (NIC) 214, which in turn provides a data path to any suitable wired or physical network connection 220, or to a wireless data connection via wireless network interface 218. Example wireless connections include cellular, Wi-Fi, Bluetooth, NFC, wireless universal serial bus (wireless USB), satellite, and the like. Example wired interfaces include Ethernet, USB, IEEE 1394 (FireWire), Lightning, telephone line, or the like. Processor 202 is also in data communication with one or more sensors which provide data relative to a state of the device or associated surroundings, such as device temperature, ambient temperature, humidity, device movement and the like.

Processor 202 can also be in data communication with any suitable user input/output (I/O) interface 219 which provides data communication with user peripherals, such as displays, keyboards, mice, track balls, touch screens, or the like. Hardware monitors suitably provide device event data, working in concert with suitable monitoring systems. By way of further example, monitoring systems may include page counters, sensor output, such as consumable level sensors, temperature sensors, power quality sensors, device error sensors, door open sensors, and the like. Data is suitably stored in one or more device logs, such as in storage 216 of FIG. 2.

Also in data communication with data bus 212 is a document processor interface 222 suitable for data communication with MFP functional units 250. In the illustrated example, these units include copy hardware 240, scan hardware 242, print hardware 244 and fax hardware 246 which together comprise MFP functional hardware 250. It will be understood that functional units are suitably comprised of intelligent units, including any suitable hardware or software platform.

Intelligent controller 201 is suitably provided with an embedded web server system for device configuration and administration. A suitable web interface is comprised of TOPACCESS Controller (sometimes referred to in the subject illustrations as “TA”), available from Toshiba TEC Corporation.

Turning now to FIG. 3, illustrated is an example embodiment of a suitable portable digital device 300 such as smartphone 112 of FIG. 1. Included are one or more processors, such as that illustrated by processor 310. Each processor is suitably associated with non-volatile memory, such as read only memory (ROM) 312 and random access memory (RAM) 314, via a data bus 318.

Processor 310 is also in data communication with a storage interface 325 for reading or writing to a data storage system 316, suitably comprised of a hard disk, optical disk, solid-state disk, or any other suitable data storage as will be appreciated by one of ordinary skill in the art.

Processor 310 is also in data communication with a network interface controller (NIC) 330, which provides a data path to any suitable wired or physical network connection via physical network interface 334, or to any suitable wireless data connection via wireless interface 332, such as one or more of the networks detailed above. The system suitably uses location based services. By way of example, if multiple error event management systems are used, it may be advantageous to have monitoring of devices completed by a local or more proximate event management system.

Processor 310 is also in data communication with a user input/output (I/O) interface 350 which provides data communication with user peripherals, such as display 360, as well as keyboards 352, mice, track balls, or other pointing devices 354, touch screen 370, or the like. It will be understood that functional units are suitably comprised of intelligent units, including any suitable hardware or software platform.

FIG. 4 is a flowchart 400 of an example embodiment of voice assisted document processing operations. The process commences at block 404, and proceeds to block 408 when an app, such as a smartphone app, establishes connection with an MFP. The user interacts with a device or app at block 412 to invoke a document processing task. Device communication options between the MFP and smartphone are selectable by the user at block 416. As noted above, any suitable wireless protocol can be used; examples for communication include Wi-Fi, Bluetooth, NFC, optical, cellular, and the like. A communication session is suitably accomplished in accordance with preselected user preferences and language settings from block 420.

A natural language dialog is engaged at block 424, with back-and-forth communication as needed to set the user's desired document processing operation. Appropriate commands are sent to the MFP at block 428. If the task is determined to be complete at block 432, the process suitably ends. If not, the process returns to block 416.

FIG. 5 is a flowchart 500 of an example embodiment of a natural language dialog control of a document processing operation. The process commences at block 504 and proceeds to block 508, where it waits until a Bluetooth connection is made. Once connected, a determination is made at block 512 as to whether a document has been loaded into an MFP's automated document feeder or placed on its platen for copying. If so, the user is determined to desire to copy their document. If a single copy is chosen at block 516, confirmation is provided at block 520, and the user is asked to verify this at block 524. If the user does not, progress is made to block 528 where the user can specify a number of copies desired. If a single copy is not selected at block 516, progress is made directly to block 528 where the user is prompted to specify a number of copies desired, which is received at block 532. Confirmation of the selected number is stated at block 536 and confirmation solicited at block 540. If not confirmed, progress returns to block 528. If confirmed, a copy is initiated at block 544. If a single copy was confirmed at block 524, progress is directly to block 544.

Next, the user is asked whether stapling is desired at block 548. This is suitably bypassed if only a single page is being copied. If stapling is selected, pages are stapled at block 552 and confirmation is stated to the user at block 556, suitably with an admonition to remove their original. If stapling was not selected at block 548, progress is directly to block 556. Once paper has been removed as determined at block 560, the system suitably returns to block 508 for a continued or new Bluetooth connection.
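The confirmation loop of FIG. 5 can be sketched in a few lines of Python. This is a simplified illustration, not the disclosed implementation: the `ask` and `say` callables, the prompt wording and the function name `copy_dialog` are assumptions, and the sketch only collects the copy count and the stapling choice rather than driving copy hardware at block 544.

```python
def copy_dialog(ask, say, page_count):
    """Collect the copy count and stapling choice along the lines of FIG. 5.

    `ask(prompt)` returns the user's reply as text and `say(message)` speaks
    or displays a message. Both are hypothetical callables supplied by the app.
    """
    copies = None
    if ask("Would you like a single copy?") == "yes":            # block 516
        say("Making one copy.")                                   # block 520
        if ask("Is that correct?") == "yes":                      # block 524
            copies = 1
    while copies is None:
        # Assumes a numeric reply; a real app would validate the input.
        requested = int(ask("How many copies would you like?"))   # blocks 528, 532
        say(f"You asked for {requested} copies.")                 # block 536
        if ask("Is that correct?") == "yes":                      # block 540
            copies = requested
    staple = False
    if page_count > 1:  # stapling is bypassed for a single page
        staple = ask("Would you like the copies stapled?") == "yes"  # block 548
    say("Your copies are done. Please remove your original.")     # block 556
    return copies, staple
```

Passing scripted replies such as `"no"`, `"3"`, `"no"`, `"2"`, `"yes"`, `"yes"` for a four-page original walks the loop twice before settling on two stapled copies, which mirrors the return from block 540 to block 528 when a number is not confirmed.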

FIGS. 6-8 depict flow diagrams of example embodiments of device and human interaction for natural language controlled document processing operations.

Referring also to FIG. 9, example screenshots of a mobile app are illustrated.

-   The home screen (left) shows instructions on how to initiate a conversation.
-   Tapping the gray microphone invokes listening.
-   When the client is listening, the microphone is blue and a spinning indicator is shown. This is when the user should speak.
-   When the copy task is finished, the conversation bubbles are cleared and the Home screen is shown.
-   When a “Stop” command is invoked, the conversation bubbles are cleared and the Home screen is shown.

The user can invoke a command as follows:

-   By pressing the microphone button at the bottom of the screen.
-   By saying “Hey Moppy”, or “Hey Jackie”, or another suitable name. Because the system is “listening” all the time, the client may erroneously respond to ambient chatter if a common name is used.

Compound commands can contain three or more keyword commands, for example:

-   Please make one two-sided copy and staple it
-   Please make a copy that is two-sided (you will be prompted for staple)
-   One stapled copy please (you will be prompted for sided)
-   A stop command (for example, “STOP”) can cancel the conversation and stop voice recognition from listening. The stop command can clear the screen and show the initial instruction to the user.
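One way to picture how a compound command is split into keywords, and how the app decides which follow-up prompts are still needed, is the Python sketch below. The regular expressions, the small number-word table and the function names are assumptions made for illustration; the disclosure does not describe a specific parsing technique.

```python
import re

# Assumed number words; a real dictionary would be larger and per-language.
NUMBER_WORDS = {"a": 1, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
REQUIRED = ("copies", "sides", "staple")


def is_stop(text):
    """True when the utterance is just the stop command."""
    return text.strip().lower().strip(".!\"' ") == "stop"


def parse_command(text):
    """Extract whichever of copies, sides and staple the utterance mentions."""
    cleaned = re.sub(r"[\u2013\u2014]", "-", text.lower())
    job = {}
    if re.search(r"\b(two|2|double)-sided\b|\bduplex\b", cleaned):
        job["sides"] = 2
    elif re.search(r"\b(one|1|single)-sided\b|\bsimplex\b", cleaned):
        job["sides"] = 1
    if re.search(r"\bstapled?\b", cleaned):
        job["staple"] = True
    # Drop "...-sided" phrases so the "two" in "two-sided" is not a count.
    without_sides = re.sub(r"\b\w+-sided\b", " ", cleaned)
    match = re.search(
        r"\b(\d+|a|one|two|three|four|five)\s+(?:\w+\s+)?cop(?:y|ies)\b",
        without_sides,
    )
    if match:
        word = match.group(1)
        job["copies"] = int(word) if word.isdigit() else NUMBER_WORDS[word]
    return job


def missing_settings(job):
    """List the settings the app still has to prompt for, in a fixed order."""
    return [name for name in REQUIRED if name not in job]
```

With these assumptions, the second example above parses to one two-sided copy and leaves `staple` as the only missing setting, so the app would prompt for stapling next. The third example leaves `sides` missing instead.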

When the mobile app is first executed, a user can configure initial settings, which can be changed later by tapping the setting button. Settings may be particular to a user or to a user's typical document processing needs. The app suitably asks for a user name and password, for example as illustrated in FIG. 10, and enables a finger registration option. Configuration settings suitably include:

-   Nick Name: the name you want Toshiba Copy Talk to call you using voice, e.g., “Rashmi”. You may have to spell it phonetically, for example “Rashmee”
-   User Name: allows MFP authentication
-   Password: allows MFP authentication
-   Device IP Address: allows connectivity to the MFP Toshiba Copy Talk app
-   English/Japanese: voice recognition and voice response
-   Touch ID: allows Toshiba Copy Talk to use fingerprint access
-   Keyword On/Off: allows Toshiba Copy Talk to listen for keywords
-   Timeout: allows Toshiba Copy Talk to stop listening
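The settings listed above could be held in a small record on the mobile device. The Python sketch below is one possible shape, offered as an assumption: the field names, the default values and the `validate` helper are hypothetical and are not taken from the app itself.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class AppSettings:
    """Hypothetical container for the configuration settings listed above."""
    nick_name: str                   # spoken name, possibly spelled phonetically
    user_name: str                   # used for MFP authentication
    password: str                    # used for MFP authentication
    device_ip: str                   # address of the MFP-side app
    language: str = "English"        # "English" or "Japanese"
    touch_id: bool = False           # fingerprint access on or off
    keyword_listening: bool = True   # listen for keywords on or off
    timeout_seconds: int = 30        # assumed default; stop listening after this

    def validate(self):
        """Return a list of human-readable problems; empty means usable."""
        problems = []
        try:
            ipaddress.ip_address(self.device_ip)
        except ValueError:
            problems.append("Device IP Address is not a valid IP address")
        if self.language not in ("English", "Japanese"):
            problems.append("Language must be English or Japanese")
        if self.timeout_seconds <= 0:
            problems.append("Timeout must be positive")
        return problems
```

A record such as `AppSettings("Rashmee", "rashmi", "secret", "192.168.1.50")` validates cleanly, while a malformed address or an unsupported language is reported before the app tries to reach the MFP.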

The system suitably identifies certain operations and invokes appropriate conversations. For example, the system can prompt the user with a staple option, but only if the job requires two or more pages.

FIG. 11 is a flow diagram 1100 showing natural language dialog as it is suitably reflected on a display or touchscreen of a mobile digital device, such as a tablet or smartphone.

FIGS. 12-13 illustrate example embodiments of additional device/user interaction scenarios.

FIG. 14 illustrates an example embodiment of a suitable relationship table between language key words, optional key words and multiple languages for user-device interaction.

While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the spirit and scope of the inventions.

What is claimed is:
 1. A system comprising: a multifunction peripheral including an intelligent controller having processor and associated memory, a data interface configured for data communication with a mobile data device, a document processing engine configured to perform a document processing operation under control of the processor in accordance with a device operation instruction received from the mobile data device, wherein the processor is configured to generate device status data corresponding to a monitored status of the multifunction peripheral; and a mobile data device including a processor and associated memory, the memory storing user-specific settings data, the memory further storing pattern data corresponding to at least one preselected data pattern, a data interface configured for data communication with the multifunction peripheral, a natural language input, a natural language output, and a touchscreen display, wherein the processor is configured to determine when the mobile device is proximate to the multifunction peripheral, wherein the processor is further configured to communicate data with the multifunction peripheral when the mobile data device is proximate to the multifunction peripheral, wherein the processor is further configured to receive status data from the multifunction peripheral via the data interface, wherein the processor is further configured for ongoing monitoring of input from a user to the multifunction peripheral or the mobile device, wherein the processor is further configured for an ongoing comparison of user input relative to the pattern data, wherein the processor is further configured determine a match between user input and a pattern contained in the pattern data, wherein the processor is further configured to initiate a natural language dialog with the user in accordance with received status data and user-specific settings data when the mobile data device is proximate to the multifunction peripheral and a match between the user input and the pattern data is determined, wherein the processor is further configured to receive a document processing instruction from the user via the natural language dialog with the user, and wherein the processor is further configured to communicate an operational instruction to the multifunction peripheral via the data interface of the mobile data device in accordance with a received document processing instruction.
 2. The system of claim 1 wherein the mobile data device processor is further configured to determine when the mobile data device is proximate to the multifunction peripheral in accordance with a signal received from an associated beacon.
 3. The system of claim 1 wherein the mobile data device processor is further configured to monitor the user input comprised of voice input from the user.
 4. The system of claim 1 wherein the mobile data device processor is further configured to monitor the user input comprised of tactile user interaction with the multifunction peripheral.
 5. The system of claim 1 wherein the status data includes a level of consumables in the multifunction peripheral.
 6. The system of claim 1 wherein the status data includes document processing capabilities of the multifunction peripheral.
 7. The system of claim 1 wherein the user-specific settings include preselected document processing operation settings associated with the user.
 8. The system of claim 1 wherein the mobile data device processor is further configured to generate the operational instruction in accordance with additional user input from a sequence of data exchanges supplied in the natural language dialog.
 9. A method comprising: storing, in a mobile data device including a processor and associated memory, user-specific settings data; storing, in the mobile data device, pattern data corresponding to at least one preselected data pattern; receiving, by the mobile data device, natural language input from a user, determining when the mobile device is proximate to a multifunction peripheral; communicating, by the mobile data device, data with the multifunction peripheral when the mobile data device is proximate to the multifunction peripheral; receiving, by the mobile data device, status data from the multifunction peripheral; performing an ongoing monitoring of input from the user to the multifunction peripheral or the mobile device; performing, by the mobile data device, an ongoing comparison of user input relative to the pattern data; determining, by the mobile data device, a match between user input and a pattern contained in the pattern data; initiating, by the mobile data device, a natural language dialog with the user in accordance with received status data and user-specific settings data when the mobile data device is proximate to the multifunction peripheral and a match between the user input and the pattern data is determined; receiving, by the mobile data device, a document processing instruction from the user via the natural language dialog with the user; communicating, by the mobile data device, an operational instruction to the multifunction peripheral in accordance with a received document processing instruction; generating in the multifunction peripheral having an intelligent controller with a processor and associated memory, status data corresponding to a monitored status of the multifunction peripheral; communicating, by the multifunction peripheral, the status data to the mobile data device, receiving, by the multifunction peripheral, the operational instruction from the mobile data device, and performing, by the multifunction peripheral, a document processing operation in accordance with the operational instruction.
 10. The method of claim 9 further comprising determining when the mobile data device is proximate to the multifunction peripheral in accordance with a signal received from an associated beacon.
 11. The method of claim 9 further comprising monitoring the user input comprised of voice input from the user.
 12. The method of claim 9 further comprising monitoring the user input comprised of tactile user interaction with the multifunction peripheral.
 13. The method of claim 9 wherein the status data includes a level of consumables in the multifunction peripheral.
 14. The method of claim 9 wherein the status data includes document processing capabilities of the multifunction peripheral.
 15. The method of claim 9 wherein the user-specific settings include preselected document processing operation settings associated with the user.
 16. The method of claim 9 further comprising generating the operational instruction in accordance with additional user input from a sequence of data exchanges supplied in the natural language dialog.